OLLIE: On-Line Learning For Information Extraction
Abstract
This paper reports work aimed at developing an open, distributed learning environment, OLLIE, where researchers can experiment with different Machine Learning (ML) methods for Information Extraction. Once the required level of performance is reached, the ML algorithms can be used to speed up the manual annotation process. OLLIE uses a browser client, while data storage and ML training are performed on servers. The different ML algorithms use a unified programming interface, so the integration of new ones is straightforward.
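To illustrate what a unified programming interface for interchangeable ML algorithms might look like, here is a minimal Python sketch. It is not OLLIE's actual API, which the abstract does not describe: the names `Learner`, `MajorityLabelLearner`, `REGISTRY`, and `make_learner` are hypothetical, and the toy learner only stands in for a real ML method.

```python
from abc import ABC, abstractmethod
from collections import Counter, defaultdict


class Learner(ABC):
    """Hypothetical common interface; every algorithm exposes train/predict."""

    @abstractmethod
    def train(self, tokens, labels):
        """Update the model from one annotated token sequence."""

    @abstractmethod
    def predict(self, tokens):
        """Return one label per token."""


class MajorityLabelLearner(Learner):
    """Toy learner: tags a token with the label it most often had in training."""

    def __init__(self, default="O"):
        self.default = default                 # label for unseen tokens
        self._counts = defaultdict(Counter)    # token -> label frequencies

    def train(self, tokens, labels):
        for token, label in zip(tokens, labels):
            self._counts[token.lower()][label] += 1

    def predict(self, tokens):
        predicted = []
        for token in tokens:
            counts = self._counts.get(token.lower())
            predicted.append(counts.most_common(1)[0][0] if counts else self.default)
        return predicted


# Adding a new algorithm means registering one more Learner subclass (assumed design).
REGISTRY = {"majority": MajorityLabelLearner}


def make_learner(name, **kwargs):
    """Look up a learner by name so callers never depend on a concrete class."""
    return REGISTRY[name](**kwargs)
```

Under this assumed design, client code calls only `train` and `predict`, so swapping in another algorithm changes nothing but the registry entry.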
Similar resources
Open Language Learning for Information Extraction
Open Information Extraction (IE) systems extract relational tuples from text, without requiring a pre-specified vocabulary, by identifying relation phrases and associated arguments in arbitrary sentences. However, state-of-the-art Open IE systems such as REVERB and WOE share two important weaknesses – (1) they extract only relations that are mediated by verbs, and (2) they ignore context, thus e...
Open Information Extraction with Tree Kernels
Traditional relation extraction seeks to identify pre-specified semantic relations within natural language text, while open Information Extraction (Open IE) takes a more general approach, and looks for a variety of relations without restriction to a fixed relation set. With this generalization comes the question, what is a relation? For example, should the more general task be restricted to rel...
Open Information Extraction via Contextual Sentence Decomposition
We show how contextual sentence decomposition (CSD), a technique originally developed for high-precision semantic search, can be used for open information extraction (OIE). Intuitively, CSD decomposes a sentence into the parts that semantically “belong together”. By identifying the (implicit or explicit) verb in each such part, we obtain facts like those in OIE. We compare our system, called CSD-IE, ...
ON-LINE SOLID-PHASE EXTRACTION AND LIQUID CHROMATOGRAPHY/PARTICLE BEAM-MASS SPECTROMETRY FOR DEGRADATION STUDIES OF SOME POLAR PESTICIDES IN WATER
An on-line automated method for photodegradation studies of isoproturon, diuron, atrazine, fenitrothion, and metoxuron by means of liquid chromatography/mass spectrometry (LC/MS) with particle beam (PB) interface is described. Surface water samples were first spiked with 50 µg/l of each pesticide and then exposed to the radiation of the medium-pressure mercury lamp. Next, in regular intervals o...
Phishing website detection using weighted feature line embedding
The aim of phishing is to trace users' private information without their permission by designing a new website that mimics a trusted website. Information technology specialists do not agree on a unique definition of the discriminative features that characterize phishing websites. Therefore, the number of reliable training samples in phishing detection problems is limited. M...